Abstract

The fields of compressed sensing (CS) and matrix completion have shown that high-dimensional 
signals with sparse or low-rank structure can be effectively projected into a low-dimensional space 
(for efficient acquisition or processing) when the projection operator achieves a stable embedding of 
the data by satisfying the Restricted Isometry Property (RIP). It has also been shown that such stable 
embeddings can be achieved for general Riemannian submanifolds when random orthoprojectors are used 
for dimensionality reduction. Due to computational costs and system constraints, the CS community 
has recently explored the RIP for structured random matrices (e.g., random convolutions, localized 
measurements, deterministic constructions). The main contribution of this paper is to show that any 
matrix satisfying the RIP (i.e., providing a stable embedding for sparse signals) can be used to con- 
struct a stable embedding for manifold-modeled signals by randomizing the column signs and paying 
reasonable additional factors in the number of measurements. We demonstrate this result with several 
new constructions for stable manifold embeddings using structured matrices. This result allows advances 
in efficient projection schemes for sparse signals to be immediately applied to manifold signal models. 

I. Introduction 

Much of modern signal processing rests on the observation that many high-dimensional signals of 
interest in fact have an intrinsic low-dimensional structure that can be described with a geometric model. 
For example, sparse signals live on a union of low-dimensional subspaces within an ambient
high-dimensional signal space [2], while parametric signals and certain non-parametric signal collections are
constrained to live on (or near) low-dimensional manifolds [3,4]. While this low-dimensional structure
can be exploited to great effect in signal processing applications, the high dimensionality of the ambient
signal space can severely complicate the acquisition and processing of the data [5]. To partially address this
issue, several recent results have shown that compressive linear operators can provide stable embeddings
that preserve the geometry of the signal model (i.e., preserve pairwise distances between signals) in a
lower-dimensional space.

Much of the work on compressive linear operators has come in the field of compressed sensing (CS), 
where it is known that certain randomized compressive matrix constructions will satisfy a condition known 
as the Restricted Isometry Property (RIP) [6] with high probability. The RIP guarantees that a matrix 
will approximately preserve distances between all pairs of sparse signals, therefore stably embedding the 
signal model by preserving the geometric structure of the union of subspaces in the compressed (i.e., 
measurement) space. The RIP is a sufficient condition to guarantee robust recovery of sparse signals from 
their measurements via solving a computationally tractable $\ell_1$-minimization program. In a similar vein,
an equivalent formulation of the RIP for preserving distances between low-rank matrices also leads to 
matrix recovery guarantees from underdetermined linear measurements [7]. 

The notion of a stable embedding, as quantified in the RIP, has also been extended to describe linear 
operators acting on signals living on a low-dimensional manifold [8,9]. For example, it has been shown 
that an undersampled random orthoprojector can stably embed a manifold from a high-dimensional space 
into a lower-dimensional space [8,9]. Such stable embeddings are valuable because they ensure that 
key properties of the manifold are retained in the low-dimensional measurement space where processing 
is much more computationally efficient. In particular, a stable embedding is a sufficient condition for 
guarantees on our ability to recover the original signal via tractable recovery algorithms [10] and for 
performance guarantees on data processing or inference algorithms in the measurement space [11]. 
Moreover, a stable embedding also guarantees that manifold learning algorithms (e.g., Isomap [12]) 
can be applied in the low-dimensional measurement space nearly as accurately as in the original signal 
space [13]. 

Recently, the CS community has turned to investigating structured measurement systems because 
unstructured systems (i.e., those corresponding to i.i.d. random matrices or random orthoprojectors that are 
classically analyzed in the CS literature) may be impractical due to memory constraints, computational 
costs, or limitations in the data acquisition architecture. Several structured CS systems (e.g., random
convolution systems described by partial Toeplitz [14] and circulant matrices [15, 16], localized sensing 
systems described by randomized block diagonal matrices [17], and certain deterministic matrix construc- 
tions [18]) have been shown to satisfy the RIP while requiring (at least analytically) a small increase in 
the number of measurements beyond what is needed for an unstructured random matrix. While concerns 
about the practicality of unstructured measurements also apply to systems acquiring manifold-modeled 
signals, the existing stable embedding results for structured matrices apply only to sparse signal models. 

The main contribution of this paper is to demonstrate that any matrix satisfying the RIP for sparse 
signals can be used to generate a stable embedding of a manifold by randomizing the column signs of 
the matrix. Our main theorem statement gives an explicit recipe for using the RIP guarantee of a matrix 
to determine the number of measurements sufficient to guarantee (with a prescribed probability) a stable 
manifold embedding of a specified conditioning. This result generalizes the existing stable manifold 
embedding results by paying a reasonable penalty in the number of measurements to accommodate any 
matrix for which the RIP is established. As practical examples, we compute the number of measure- 
ments sufficient for stable manifold embeddings when using measurement systems constructed from 
several structured matrices studied in the CS literature, including subsampled Fourier transforms, random 
convolution matrices, block diagonal matrices, and certain deterministic matrices. Our work rests on a 
recent result [19] showing that when the columns of an RIP matrix are modulated by a random sign 
sequence, the matrix will obey a form of the Johnson-Lindenstrauss (JL) lemma [20] and can therefore 
provide a stable embedding of an arbitrary finite point cloud. Following similar arguments to [8], we 
extend the finite JL result to all points living on a manifold. 

II. Background 

A. Stable Embeddings 

When $M < N$, a compressive linear operator $\Phi \in \mathbb{R}^{M \times N}$ possesses a nullspace of dimension at least
$N - M$. Therefore, distinct signals may be mapped onto, or close to, the same measurement by the
operator if their difference falls on or near its nullspace. In any application with finite resolution or noise,
instability can result if very different signals are mapped close together. It is therefore critical that the
geometry of the subset $\mathcal{M} \subset \mathbb{R}^N$ of signals of interest be maintained in the measurement space $\mathbb{R}^M$.
This geometry preservation idea forms the basis for the following definition of a stable embedding by
an operator:

Definition II.1. A linear operator $\Phi$ provides a stable embedding of a subset $\mathcal{M} \subset \mathbb{R}^N$ with conditioning
$\delta_{\mathcal{M}}$ if for all pairs $x_1, x_2 \in \mathcal{M}$, we have

$$(1 - \delta_{\mathcal{M}}) \le \frac{\|\Phi x_1 - \Phi x_2\|_2^2}{\|x_1 - x_2\|_2^2} \le (1 + \delta_{\mathcal{M}}). \tag{1}$$

For a finite data cloud $\mathcal{M}$ (i.e., $|\mathcal{M}| < \infty$), a stable embedding is established by the
Johnson-Lindenstrauss (JL) lemma [20]. For many random operators $\Phi$ [19,21], the JL lemma states that for
a stable embedding of the set $\mathcal{M}$ to hold with high probability, the number of measurements need only
scale with $\log(|\mathcal{M}|)$ and not with the size of the ambient signal space. In contrast, in CS the set $\mathcal{M}$
comprises all $S$-sparse vectors, $\mathcal{M} := \{x \in \mathbb{R}^N \mid \|x\|_0 \le S\}$, where $\|x\|_0$ counts the number
of non-zero entries in $x$. For this signal family, the notion of a stable embedding is given by the RIP,
defined as:

Definition II.2. A linear operator $\Phi \in \mathbb{R}^{M \times N}$ satisfies the Restricted Isometry Property of order $S$ and
conditioning $\delta$ (or RIP-$(S, \delta)$ in short) if for all $x \in \mathbb{R}^N$ with at most $S$ non-zero entries, we have

$$(1 - \delta)\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1 + \delta)\|x\|_2^2.$$

Because the difference between two $S$-sparse signals is at most $2S$-sparse, an operator satisfying RIP-$(2S, \delta)$
provides a stable embedding with conditioning $\delta$ of the union of all $S$-sparse subspaces of $\mathbb{R}^N$.
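
As a rough numerical illustration of Definitions II.1 and II.2, the following sketch estimates the conditioning achieved by an i.i.d. Gaussian matrix on a small cloud of $S$-sparse vectors. The function names, the dimensions, and the use of squared norms in the distance ratio are assumptions made for this example; the empirical value depends on the random draw and is not a certified RIP constant.

```python
import itertools
import numpy as np

def embedding_conditioning(Phi, points):
    """Smallest delta with (1 - delta) <= ||Phi(x1 - x2)||^2 / ||x1 - x2||^2 <= (1 + delta)
    over all pairs of distinct rows of `points` (squared norms are an assumption)."""
    worst = 0.0
    for x1, x2 in itertools.combinations(points, 2):
        diff = x1 - x2
        denom = np.dot(diff, diff)
        if denom == 0.0:
            continue  # identical points carry no distance information
        ratio = np.linalg.norm(Phi @ diff) ** 2 / denom
        worst = max(worst, abs(ratio - 1.0))
    return worst

def random_sparse_vectors(n_dim, sparsity, count, rng):
    """Draw `count` vectors in R^n_dim, each with `sparsity` Gaussian non-zeros."""
    X = np.zeros((count, n_dim))
    for row in X:  # each row is a view, so the assignment below fills X in place
        support = rng.choice(n_dim, size=sparsity, replace=False)
        row[support] = rng.standard_normal(sparsity)
    return X

rng = np.random.default_rng(0)
N, M, S = 256, 64, 4
Phi = rng.standard_normal((M, N)) / np.sqrt(M)  # hypothetical unstructured operator
X = random_sparse_vectors(N, S, 50, rng)
delta_hat = embedding_conditioning(Phi, X)      # empirical conditioning on this cloud
```

Because only a finite sample of sparse vectors is tested, `delta_hat` can only underestimate the true RIP conditioning of `Phi`.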

B. Manifold-modeled Signals 

The sparsity and low-rank signal models that have gained significant attention in the signal processing 
community do not apply well to all signal families. Instead, many high-dimensional signals can be 
modeled as lying on low-dimensional submanifolds embedded in Euclidean space. One example class of
such signals is parametric signals that are determined by a parameter $\theta \in \Theta$, where $\Theta$ is a $D$-dimensional
(typically $D \ll N$) parameter space (which could be a $D$-dimensional manifold itself or simply a subset
of $\mathbb{R}^D$). More precisely, a parametric signal class can be written as $\mathcal{M} := \{x \in \mathbb{R}^N \mid x = f(\theta), \theta \in \Theta\}$
where $f : \Theta \to \mathbb{R}^N$ is a smooth function. Examples of such parametric signals include a 1-dimensional
signal parameterized by a time delay ($D = 1$), a radar chirp characterized by its starting and ending
time and frequency ($D = 4$), and images of an object articulated in space [3]. Not all manifold-modeled
signals of interest can be parametrized. Nonetheless, low-dimensional submanifolds have also been useful
as approximate models for nonparametric signal classes such as images of human faces [4] or
handwritten digits [22].
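
To make the time-delay example concrete, the sketch below samples a one-parameter ($D = 1$) family of delayed pulses in $\mathbb{R}^N$. The Gaussian pulse shape, its width, the sampling grid, and the function name `delayed_pulse` are illustrative assumptions and are not taken from the cited references.

```python
import numpy as np

def delayed_pulse(theta, n_samples=128, width=0.05):
    """Sample f(theta): a Gaussian pulse centered at time theta on a grid over [0, 1]."""
    t = np.linspace(0.0, 1.0, n_samples)
    return np.exp(-0.5 * ((t - theta) / width) ** 2)

# Sweep the single parameter theta; every row is one point on the manifold in R^128.
thetas = np.linspace(0.3, 0.7, 41)
manifold_samples = np.stack([delayed_pulse(th) for th in thetas])
```

Neighboring delays produce nearby points in $\mathbb{R}^N$, while distant delays produce nearly orthogonal pulses, which is one way to see that the family traces a curved path rather than a line segment.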

Before discussing the stable embedding of manifolds, we establish some necessary notation and
terminology. In the remainder of this paper, we consider $\mathcal{M}$ to be a Riemannian submanifold that inherits
the canonical Euclidean metric from the ambient space. For a given point $x$ on $\mathcal{M}$ embedded in $\mathbb{R}^N$,
we let $T_x\mathcal{M}$ denote the tangent space of $\mathcal{M}$ at $x$. For a submanifold $\mathcal{M}$ of dimension $D$, $T_x\mathcal{M}$ can
be thought of as a $D$-dimensional linear subspace of $\mathbb{R}^N$ passing through the origin. We let $d_{\mathcal{M}}(x, y)$
denote the geodesic distance between two points $x, y \in \mathcal{M}$ (i.e., the length of the shortest path between
$x$ and $y$ along the submanifold).

In this work, we consider two additional characterizations of a manifold that will be useful for 
describing certain local and global properties of the manifold. The first is the second fundamental form of 
the manifold (as defined in [23]), which provides a bound on the worst-case curvature of any unit speed
geodesic path along the manifold. We assume that the second fundamental form is uniformly bounded
by some number $1/\tau$, where this upper bound is related to the condition number (also denoted by $1/\tau$) as
described in [24] and used in [8]. Appendix A also describes the consequences of this upper bound on
the second fundamental form in greater detail. 

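A second useful quantity is the geodesic regularity of a manifold. Let $\mathrm{vol}(B)$ denote the volume of
a set $B$, defined as $\mathrm{vol}(B) = \int_B dv$ where $dv$ is the volume element on $B$. Next, for a Riemannian
manifold $\mathcal{M}$, denote by $B_{\mathcal{M}}(x, \epsilon)$ the geodesic ball centered at $x \in \mathcal{M}$ of radius $\epsilon$,
$B_{\mathcal{M}}(x, \epsilon) := \{p \in \mathcal{M} \mid d_{\mathcal{M}}(p, x) \le \epsilon\}$. Similarly, let $B_{\mathbb{R}^D}(x, \epsilon)$ be the Euclidean ball of radius $\epsilon$ centered at $x \in \mathbb{R}^D$,
$B_{\mathbb{R}^D}(x, \epsilon) := \{p \in \mathbb{R}^D \mid \|p - x\|_2 \le \epsilon\}$. Then, the geodesic regularity $R$ is defined as follows:

Definition II.3. A $D$-dimensional Riemannian submanifold $\mathcal{M}$ of $\mathbb{R}^N$ has geodesic regularity $R$ at
resolution $\epsilon_0$ if for every $\epsilon \le \epsilon_0$ and for every $x \in \mathcal{M}$,

$$\mathrm{vol}(B_{\mathbb{R}^D}(x, \epsilon)) \le R^D \, \mathrm{vol}(B_{\mathcal{M}}(x, \epsilon)).$$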

We see that the geodesic regularity allows a uniform comparison of the geodesic and Euclidean balls 
of the same radius everywhere on the manifold. This comparison is related to a certain intrinsic curvature 
(in particular, the scalar curvature) of the manifold [25]. The consequences of the geodesic regularity 
R on the covering numbers of a manifold (to be described later) are described in Appendix B. As in 
[8], we shall subsequently neglect the minor dependence of the geodesic regularity $R$ on the maximum
resolution $\epsilon_0$.

C. Related Work

The work in this paper is closely related to [8] and [9], which both showed that with high probability,
a random orthogonal projection $\Phi \in \mathbb{R}^{M \times N}$ will provide a stable embedding of a $D$-dimensional
submanifold $\mathcal{M} \subset \mathbb{R}^N$ whenever $M$ scales linearly in $D$ and logarithmically in certain other parameters
of the manifold. We note that the main differences between these two works are that in [8], there is an
additional dependence of $M$ on $\log(N)$, and that the manifold characterizations in both papers are slightly
different. In this paper, we adopt the manifold characterizations presented in [8]. The proof of each of
these results requires a finite covering of points carefully chosen from the manifold and a covering of the
tangent planes of those points. Using the JL lemma previously described, it then can be argued that, with
high probability, a random orthogonal projection will provide a stable embedding of these points. Then,
various geometric arguments allow one to conclude that the same orthogonal projection will provide a
(slightly weaker) stable embedding of the entire manifold $\mathcal{M}$.

In this work, we adopt the same general proof approach but replace the JL lemma for random 
orthoprojectors with a JL lemma for operators satisfying the RIP. The following theorem, adapted 
from [19], expresses this JL lemma: 

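Theorem II.1. Fix $0 < \rho, \epsilon < 1$ and suppose there is a finite set of points $E \subset \mathbb{R}^N$. Also suppose
we have a matrix $\Phi \in \mathbb{R}^{M \times N}$ satisfying the RIP of order $S \ge 40 \log\left(\frac{4|E|}{\rho}\right)$ and conditioning $\delta \le \frac{\epsilon}{4}$.
Let $\xi \in \mathbb{R}^N$ be a Rademacher sequence (i.e., a sequence of i.i.d. equi-probable $\pm 1$ Bernoulli random
variables), construct the diagonal Rademacher matrix $D_\xi := \mathrm{diag}(\xi)$, and define $\widetilde{\Phi} := \Phi D_\xi \Psi$ where
$\Psi \in \mathbb{C}^{N \times N}$ is any unitary matrix. Then with probability exceeding $1 - \rho$, we have for all $x \in E$,

$$(1 - \epsilon)\|x\|_2^2 \le \|\widetilde{\Phi} x\|_2^2 \le (1 + \epsilon)\|x\|_2^2.$$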

In words, any operator satisfying the RIP can be used to approximately preserve the norms of any
orthogonal transform of the signals in a given finite point cloud when the signs of the columns of the
operator are randomly chosen. We remark that if the finite point cloud $E$ is the set of all differences
between points in another finite set $\mathcal{M} \subset \mathbb{R}^N$, then a matrix $\Phi \in \mathbb{R}^{M \times N}$ satisfying the RIP of order
$S \ge 40 \log\left(\frac{4|E|}{\rho}\right)$ (and conditioning $\delta \le \frac{\epsilon}{4}$) in Theorem II.1 can provide a stable embedding of $\mathcal{M}$
with high probability when the column signs of $\Phi$ are randomized.
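
The construction in Theorem II.1 can be sketched numerically as follows. The example assumes the order requirement $S \ge 40 \log(4|E|/\rho)$ with a natural logarithm, takes $\Psi$ to be the identity, and uses an i.i.d. Gaussian matrix as a stand-in for an RIP matrix; all function names are hypothetical.

```python
import math
import numpy as np

def required_rip_order(num_points, rho):
    """RIP order 40 * log(4|E| / rho), assuming the form quoted from [19] (natural log)."""
    return math.ceil(40.0 * math.log(4.0 * num_points / rho))

def randomize_column_signs(Phi, rng):
    """Return Phi @ diag(xi) for a fresh Rademacher sequence xi, together with xi."""
    xi = rng.choice([-1.0, 1.0], size=Phi.shape[1])
    return Phi * xi, xi  # broadcasting scales column j by xi[j]

def max_norm_distortion(A, points):
    """Largest | ||A x||^2 / ||x||^2 - 1 | over the rows x of `points`."""
    ratios = np.sum((points @ A.T) ** 2, axis=1) / np.sum(points ** 2, axis=1)
    return float(np.max(np.abs(ratios - 1.0)))

rng = np.random.default_rng(1)
M, N = 128, 512
Phi = rng.standard_normal((M, N)) / np.sqrt(M)    # stand-in for an RIP matrix
Phi_tilde, xi = randomize_column_signs(Phi, rng)  # Psi is taken to be the identity
E = rng.standard_normal((200, N))                 # an arbitrary finite point cloud
eps_hat = max_norm_distortion(Phi_tilde, E)       # empirical epsilon on this cloud
```

For instance, under the assumed formula a cloud of 100 points with $\rho = 0.01$ gives `required_rip_order(100, 0.01) == 424`, which illustrates how mildly the order grows with the size of the point cloud.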

III. Stable Manifold Embeddings

Section III-A contains a statement of our main result, showing that any matrix that satisfies the RIP 
(i.e., provides a stable embedding for sparse signals) can be used to form a stable embedding of a 
manifold. Section III-B illustrates how this fact can be used to form stable manifold embeddings from 
several structured matrices that have been shown to satisfy the RIP. 

A. Manifold Embeddings from RIP Operators 

Our main contribution, showing that RIP operators can be used to form stable manifold embeddings, 
is captured in the following theorem: 

Theorem III.1. Let $\mathcal{M}$ be a compact $D$-dimensional Riemannian submanifold of $\mathbb{R}^N$ with geodesic
regularity $R$, volume $V$, and second fundamental form uniformly bounded by $1/\tau$. Suppose $\Phi \in \mathbb{R}^{M \times N}$
is a matrix that satisfies RIP-$(S, \delta)$, and let $D_\xi \in \mathbb{R}^{N \times N}$ be a diagonal Rademacher matrix. Denote
$\widetilde{\Phi} = \Phi D_\xi \Psi$, where $\Psi \in \mathbb{C}^{N \times N}$ is any unitary matrix. Choose any conditioning $\delta_{\mathcal{M}} < 1$ and failure
probability $\rho$. If the RIP conditioning satisfies $\delta \le [\ldots]$ and the order $S$ of the RIP satisfies

$$S \ge 40\left(2D\log\left(\frac{3528\,R\,[\ldots]\,(N+1)^2}{[\ldots]}\right) + (2D+1)\log\left(1 + \frac{21(N+1)}{[\ldots]}\right) + \log\left(\frac{8V^2}{\rho}\right)\right),$$

then with probability exceeding $1 - \rho$, $\widetilde{\Phi}$ provides a stable embedding of $\mathcal{M}$ with conditioning $\delta_{\mathcal{M}}$.

The proof of this theorem can be found in Appendix C. Note that the theorem statement gives a 
clear recipe for both creating a stable manifold embedding from an RIP operator as well as determining 
how many measurements are sufficient to guarantee the desired result. The main theorem statement 
relates the manifold properties to the required RIP order S, which can be related to the number of 
measurements by the original RIP proof for the operator in question (see also Section III-B). We note 
especially that the RIP order only scales linearly with the manifold dimension D and logarithmically with 
the ambient dimension N. This is especially important because most interesting RIP results also have a 
linear relationship between the RIP order and the number of measurements. Consequently, for such RIP 
results, this theorem allows the creation of a manifold embedding when the number of measurements 
scales linearly with the manifold dimension. Once an RIP operator is generated with a sufficient number 
of measurements, a manifold embedding can be created by simply randomizing the column signs of the 
operator. 

Sometimes, such as in manifold learning algorithms (e.g., Isomap [12]), the main interest is in
preserving the intrinsic geodesic distances between points of a data set lying on a submanifold of $\mathbb{R}^N$
instead of their extrinsic Euclidean distances. Prior work [8] has shown that operators that stably embed a
manifold with respect to Euclidean distances are also stable embeddings with respect to geodesic distances.
Therefore, stable embedding operators constructed according to Theorem III.1 also provide geodesic stable
embeddings, guaranteeing that manifold learning algorithms can be performed significantly faster in the
compressed space without much degradation [13].

B. Manifold Embeddings from Structured Matrices 

As mentioned above, Theorem III.1 allows us to construct operators providing stable manifold
embeddings from any operator that satisfies the RIP. We illustrate the implications of our result with a
few notable examples below that establish stable manifold embeddings for operators with more structure
than existing results on random orthoprojectors [8]. In the corollaries that follow, we assume that $\mathcal{M}$
is a compact $D$-dimensional Riemannian submanifold of $\mathbb{R}^N$ with second fundamental form uniformly
bounded by $1/\tau$, volume $V$, and geodesic regularity $R$. We also assume a fixed failure probability $0 < \rho < 1$
and conditioning $0 < \delta_{\mathcal{M}} < 1$. In what follows, we denote by $C_1, C_2, \ldots$ universal constants that do not
depend on the other variables in the corollaries and that differ from corollary to corollary.

To begin, we consider a generalization of Gaussian random matrices to subgaussian random matrices
(including Bernoulli, etc.).¹

Corollary III.1 (Subgaussian matrices). Suppose $\Phi \in \mathbb{R}^{M \times N}$ is a subgaussian random matrix with
independent rows or columns following the construction in [26, Thm 5.65]. If

then with probability greater than $1 - C_2\rho$, $\widetilde{\Phi} = \Phi D_\xi$ provides a stable embedding of $\mathcal{M}$ with conditioning
$\delta_{\mathcal{M}}$.

The proof of this corollary follows from the fact that such subgaussian random matrices satisfy
RIP-$(S, \delta)$ with high probability whenever $M \ge C_3 \frac{S}{\delta^2} \log([\ldots])$ [2, 26]. A natural subset of subgaussian random
matrices are matrices with i.i.d., symmetric, subgaussian entries of an appropriate subgaussian norm.² For
this subset of matrices, both $\Phi$ and $\Phi D_\xi$ have the same distribution and thus, the stable embedding for
$\mathcal{M}$ can actually use just the operator $\Phi$ rather than the operator $\widetilde{\Phi}$. This last observation formally proves
a remark made briefly in [8] that stable manifold embeddings can also arise from random subgaussian
matrices in addition to random orthoprojectors.

¹ Subgaussian random variables are generalizations of Gaussian random variables; their definition can be found in [26].
² The subgaussian norm of a subgaussian random variable is a generalization of the standard deviation of a Gaussian random
variable.

To include a matrix with much more structure (i.e., not having i.i.d. entries), we also consider stable 
manifold embeddings by subsampled Fourier matrices. 



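Corollary III.2 (Subsampled Fourier matrices). Suppose $\Phi \in \mathbb{R}^{M \times N}$ is a subsampled Fourier matrix
whose $M$ rows are chosen uniformly at random from the $N \times N$ DFT matrix.³ If

$$M \ge [\ldots]\left(D \log([\ldots]) + \log\left(\frac{V}{\rho}\right)\right) \log^4(N) \log(\rho^{-1}),$$

then with probability greater than $1 - C_2\rho$, $\widetilde{\Phi} = \Phi D_\xi$ stably embeds $\mathcal{M}$ with conditioning $\delta_{\mathcal{M}}$.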

The proof of this corollary comes from the fact that subsampled Fourier matrices satisfy RIP-$(S, \delta)$
with probability greater than $1 - \rho$ whenever $M \ge C_3 \frac{S}{\delta^2} \log^4(N) \log(\rho^{-1})$ [27,28]. For dimensionality
reduction problems where the data lies on a manifold, this result provides an efficient measurement
scheme whereby the data is pre-multiplied by a Rademacher sequence and then $M$ coefficients from the
Fourier transform of the data are randomly chosen.
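
The measurement scheme just described can be sketched with an FFT, so that the $M \times N$ matrix is never formed. The unitary DFT normalization, the $\sqrt{N/M}$ rescaling, sampling rows without replacement, and the function name are assumptions made for this illustration.

```python
import numpy as np

def subsampled_fourier_measure(x, xi, rows):
    """Compute sqrt(N/M) * R_Omega F D_xi x, where F is the unitary DFT."""
    n = x.shape[0]
    spectrum = np.fft.fft(xi * x, norm="ortho")     # sign flip, then unitary DFT
    return np.sqrt(n / len(rows)) * spectrum[rows]  # keep M coefficients and rescale

rng = np.random.default_rng(2)
N, M = 64, 16
xi = rng.choice([-1.0, 1.0], size=N)         # Rademacher sequence
rows = rng.choice(N, size=M, replace=False)  # randomly chosen Fourier coefficients
x = rng.standard_normal(N)
y = subsampled_fourier_measure(x, xi, rows)  # complex-valued measurements
```

Each measurement vector costs one length-$N$ FFT, i.e. $O(N \log N)$ operations, compared with $O(MN)$ for a dense matrix multiply.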

In a similar direction, we also consider stable manifold embeddings from random convolutions. 

Corollary III.3 (Partial circulant matrices). Suppose $\Phi \in \mathbb{R}^{M \times N}$ is a partial circulant matrix whose
first row is made up of i.i.d. subgaussian random variables (see [16] for a detailed construction). If $N$
is large enough and

$$M \ge \frac{C_1}{\delta_{\mathcal{M}}^2}\left(D \log([\ldots]) + \log\left(\frac{V}{\rho}\right)\right) \log^4(N),$$

then with probability greater than $1 - C_2\rho$, $\widetilde{\Phi} = \Phi D_\xi$ stably embeds $\mathcal{M}$ with conditioning $\delta_{\mathcal{M}}$.

Here, the proof follows from the fact that partial circulant matrices satisfy RIP-$(S, \delta)$ with probability
greater than $1 - N^{-(\log N)(\log^2 S)}$ (hence the requirement for $N$ to be large enough) whenever
$M \ge C_3 \frac{S}{\delta^2} \log^4(N)$ (for $N > S$) [16]. This again affords us an efficient implementation of a dimensionality
reduction scheme for data residing on or near a manifold. One would first pre-process the data by
multiplying its entries with a pre-chosen random Rademacher sequence. Then, one would simply convolve
the processed data with a separate random subgaussian sequence and arbitrarily select $M$ samples of the
convolution output.

³ In fact, this corollary works also for subsampled DTFT matrices [27].
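
The two-step scheme for partial circulant matrices (sign modulation followed by convolution and subsampling) might be sketched as below. The use of circular convolution computed by FFT, the Gaussian probe, the $1/\sqrt{M}$ scaling, and the choice of the first $M$ output samples are assumptions of this example rather than details taken from [16].

```python
import numpy as np

def partial_circulant_measure(x, xi, probe, rows):
    """Sign-modulate x, circularly convolve with `probe`, and keep the samples in `rows`."""
    z = xi * x                                                  # D_xi x
    conv = np.fft.ifft(np.fft.fft(probe) * np.fft.fft(z)).real  # circular convolution
    return conv[rows] / np.sqrt(len(rows))

rng = np.random.default_rng(3)
N, M = 64, 16
xi = rng.choice([-1.0, 1.0], size=N)  # pre-chosen Rademacher sequence
probe = rng.standard_normal(N)        # subgaussian (here Gaussian) probe sequence
rows = np.arange(M)                   # an arbitrary selection: the first M samples
x = rng.standard_normal(N)
y = partial_circulant_measure(x, xi, probe, rows)
```

In this indexing the implied circulant matrix has entries `probe[(k - j) % N]`, so its first row is a reversed copy of the probe, which is still an i.i.d. sequence.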

In some situations, one may need to apply the convolution directly on the manifold-modeled data
instead of using a pre-processing step (i.e., first multiplying by a diagonal Rademacher matrix). For
this, consider the matrix $\widetilde{\Phi} := R_\Omega F D_\xi F^H$ where $F \in \mathbb{C}^{N \times N}$ is the DFT basis and $R_\Omega \in \mathbb{R}^{M \times N}$ is a
restriction operator that selects $M$ entries of a length-$N$ vector (or selects $M$ rows from an $N \times N$ matrix).
Now, this matrix follows our stable embedding construction as $\Phi := R_\Omega F$ is a subsampled Fourier matrix
that satisfies the RIP (as long as $M$ is large enough) and $\Psi := F^H$ is orthonormal. Conveniently, $F D_\xi F^H$
is a circular convolution matrix with $\xi$ being the (normalized) Fourier transform of the probe sequence
of the convolution. Thus, the matrix $\widetilde{\Phi}$ represents a subsampled convolution operation that can be used
to stably embed manifold-modeled data. This idea is formalized in the following corollary.

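Corollary III.4 (Random convolution matrices). Let $C_\xi \in \mathbb{C}^{N \times N}$ be a random circulant matrix such
that $C_\xi := F D_\xi F^H$ where $D_\xi$ is a random diagonal Rademacher matrix and $F$ is the DFT basis. Let
$\Omega \subset \{1, 2, \ldots, N\}$ with $|\Omega| = M$ be a subset selected uniformly at random. If

$$M \ge \frac{C_1}{\delta_{\mathcal{M}}^2}\left(D \log([\ldots]) + \log\left(\frac{V}{\rho}\right)\right) \log^4(N) \log(\rho^{-1}),$$

then with probability greater than $1 - C_2\rho$, $\widetilde{\Phi} := R_\Omega C_\xi$ stably embeds $\mathcal{M}$ with conditioning $\delta_{\mathcal{M}}$.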

The proof for this corollary follows quickly from the fact that subsampled DFT matrices satisfy RIP-$(S, \delta)$
with high probability whenever $M \ge C_3 \frac{S}{\delta^2} \log^4(N) \log(\rho^{-1})$ [27,28].
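
A minimal sketch of the random convolution operator $R_\Omega F D_\xi F^H$ follows, assuming a unitary DFT, a $\sqrt{N/M}$ rescaling, and a subset $\Omega$ drawn without replacement; the function name is hypothetical. Note that the output is complex-valued in general, since a Rademacher spectrum does not have conjugate symmetry.

```python
import numpy as np

def random_convolution_measure(x, xi, rows):
    """Compute sqrt(N/M) * R_Omega F D_xi F^H x with unitary DFTs."""
    n = x.shape[0]
    spectrum = np.fft.ifft(x, norm="ortho")             # F^H x
    filtered = np.fft.fft(xi * spectrum, norm="ortho")  # F D_xi F^H x (a circular convolution)
    return np.sqrt(n / len(rows)) * filtered[rows]      # R_Omega keeps M output samples

rng = np.random.default_rng(4)
N, M = 64, 16
xi = rng.choice([-1.0, 1.0], size=N)         # Rademacher spectrum of the probe
rows = rng.choice(N, size=M, replace=False)  # random subset Omega
x = rng.standard_normal(N)
y = random_convolution_measure(x, xi, rows)
```

No separate sign-flipping pre-processing step appears here: the randomness enters only through the spectrum of the convolution probe.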

To address the constraint that some systems can only take localized measurements of the signal, we
also consider operators represented by a Distinct Block Diagonal (DBD) matrix $\Phi \in \mathbb{R}^{MJ \times NJ}$ that is
non-zero only on the diagonal blocks,

$$\Phi = \begin{pmatrix} \Phi_1 & & \\ & \ddots & \\ & & \Phi_J \end{pmatrix}.$$

The blocks $\Phi_j \in \mathbb{R}^{M \times N}$ on the diagonal consist of i.i.d. subgaussian random variables (that are also
independent across the blocks). The following corollary establishes how such matrices can be used to
stably embed manifold-modeled data.

Corollary III.5 (DBD matrices). Let $\Phi \in \mathbb{R}^{MJ \times NJ}$ be a DBD matrix described above, and let
$C_\xi \in \mathbb{C}^{NJ \times NJ}$ be the circulant matrix as described in Corollary III.4. If $NJ$ is large enough and

then with probability greater than $1 - C_2\rho$, $\widetilde{\Phi} = \Phi C_\xi$ stably embeds $\mathcal{M}$ with conditioning $\delta_{\mathcal{M}}$.

The proof of this corollary follows quickly from the fact that a DBD matrix satisfies RIP-$(S, \delta)$ with
probability exceeding $1 - C_4 (NJ)^{-1}$ (hence the requirement that $NJ$ is large enough) for frequency
sparse signals (i.e., $\Phi F$ satisfies the RIP) whenever $MJ \ge C_3 S \log^6(NJ)$ [17]. This corollary states that if
we pre-process the data by convolving it with a random Rademacher probe, then a block-diagonal matrix
(having significantly more zeros than non-zeros) can stably embed a manifold.
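
Because a DBD matrix is zero outside its diagonal blocks, it can be applied block by block without storing the full $MJ \times NJ$ matrix. The sketch below shows only this localized measurement step, under the assumption of Gaussian blocks scaled by $1/\sqrt{M}$; in the setting of Corollary III.5 the input would first be convolved with the random probe. The function name is hypothetical.

```python
import numpy as np

def dbd_measure(blocks, x):
    """Apply a block-diagonal matrix, given as a list of J blocks (each M x N),
    to a vector x of length N*J by measuring each length-N segment separately."""
    J = len(blocks)
    n = blocks[0].shape[1]
    segments = x.reshape(J, n)  # segment j is seen only by block j
    return np.concatenate([B @ seg for B, seg in zip(blocks, segments)])

rng = np.random.default_rng(5)
J, M, N = 4, 3, 8
blocks = [rng.standard_normal((M, N)) / np.sqrt(M) for _ in range(J)]  # independent blocks
x = rng.standard_normal(N * J)
y = dbd_measure(blocks, x)  # MJ measurements in total
```

Only $JMN$ of the $J^2MN$ entries of the equivalent dense matrix are non-zero, which is the storage and computation saving that motivates this structure.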

As a last example, the following corollary indicates how one might be able to use a deterministic 
matrix construction to stably embed manifold-modeled data. 

Corollary III.6 (Deterministic binary matrices). Suppose $\Phi \in \{0,1\}^{M \times N}$ is a deterministic matrix
following the construction given in [18]. If

then with probability greater than $1 - \rho$, $\widetilde{\Phi} = \Phi D_\xi$ provides a stable embedding of $\mathcal{M}$ with conditioning
$\delta_{\mathcal{M}}$.

Again, this corollary follows from the fact [18] that such matrices satisfy RIP-$(S, \delta)$ whenever
$M \ge C_3 [\ldots] \log^2(N)$. Despite the additional number of required measurements, deterministic matrices can be
of interest to the CS community as it is an NP-hard problem to verify whether a randomly constructed
matrix satisfies the RIP [29].

IV. Discussions 

In this paper, we showed that all measurement operators $\Phi$ satisfying the RIP can be used to obtain
a stable embedding of a manifold. Moreover, we used this main result to demonstrate several specific 
examples of stable manifold embeddings that represent efficient dimensionality reduction schemes and 
operators that model constraints on the measurement process. These include subsampled Fourier matrices, 
random convolution matrices, block diagonal matrices, and deterministically constructed matrices. For 
each of these operators, we also provided the requisite number of measurements sufficient to ensure a 
stable embedding of the manifold with high probability and with a pre-determined conditioning. This 
result represents a combination of two directions of recent interest in the CS community: structured
measurement matrices and the development of low-dimensional signal models beyond the canonical 
sparsity model. 

While our main theorem provides a general way to construct manifold embeddings from RIP operators 
by paying reasonable penalties in the number of measurements, there is room for this result to be 
improved. Specifically, Theorem III.1 could be strengthened by removing the logarithmic dependence
on the ambient dimension $N$ from the required RIP order $S$. This reduction by a factor of $\log(N)$
would come at the cost of the proof requiring much more sophisticated machinery involving chaining
arguments as described in [9, Lemma 3.1].⁴ We have chosen to present the current result using a simpler
proof technique because even with the improvement described above, the final result would still require
a number of measurements that depends on $\log(N)$ due to this factor arising in the RIP requirements for
known matrices (as demonstrated in the corollaries of Section III-B). Therefore, while this more complex
proof technique could reduce the dependence on $\log(N)$, it could not entirely remove this dependence
on the ambient dimension. 

Appendix A

The Second Fundamental Form of a Submanifold

The second fundamental form [23] of a submanifold $\mathcal{M}$ at a point $p$ along the normal vector
$\eta \in (T_p\mathcal{M})^\perp$ (i.e., the orthogonal complement of $T_p\mathcal{M}$ in $\mathbb{R}^N$) for a tangent vector $v \in T_p\mathcal{M}$ is denoted by
$\mathrm{II}_\eta(v, p)$, and it tells us how the submanifold is "folded" in the ambient Euclidean space. We say that the
second fundamental form of $\mathcal{M}$ is uniformly bounded by $1/\tau$ if⁵

$$\sup_{v \in \mathfrak{X}(\mathcal{M}),\; p \in \mathcal{M},\; \eta \in (T_p\mathcal{M})^\perp} |\mathrm{II}_\eta(v, p)| \le \frac{1}{\tau},$$

where $\mathfrak{X}(\mathcal{M})$ is the space of vector fields on $\mathcal{M}$. In words, this means that the second fundamental
form $\mathrm{II}_\eta(v, p)$ is bounded by $1/\tau$ for all vector fields $v$, for all points $p$ on the manifold, and for all normal
vectors $\eta$ in $(T_p\mathcal{M})^\perp$.

The following lemma lists some consequences of the uniform boundedness of the second fundamental 
form on certain geometric properties of the manifold that will be useful to our analysis. 

Lemma A.1. Suppose a submanifold $\mathcal{M} \subset \mathbb{R}^N$ has second fundamental form uniformly bounded by $1/\tau$.
Let $p, q \in \mathcal{M}$ be two distinct points. Then, we have the following three properties of the manifold.

1) (Curvature) If $\gamma(t)$ denotes a unit speed parameterization of the geodesic path joining $p$ and $q$,
then $\|\gamma''(t)\|_2 \le \frac{1}{\tau}$. Moreover, denoting $\mu := d_{\mathcal{M}}(p, q)$, we have $q - p = \gamma(\mu) - \gamma(0) = \mu\gamma'(0) + r$
with $\|r\|_2 \le \frac{\mu^2}{2\tau}$.

2) (Twisting of Tangent Spaces) Suppose $d_{\mathcal{M}}(p, q) \le \tau$. Pick $u \in T_p\mathcal{M}$ and let $v \in T_q\mathcal{M}$ be the
parallel transport⁶ of $u$ into $T_q\mathcal{M}$. Then, $\langle u, v \rangle \ge 1 - \frac{d_{\mathcal{M}}(p, q)}{\tau}$.

3) (Self-avoidance) Suppose $\|p - q\|_2 \le \frac{\tau}{2}$. Then, $\|p - q\|_2 \ge d_{\mathcal{M}}(p, q) - \frac{d_{\mathcal{M}}(p, q)^2}{2\tau}$. As a corollary, we
also have $d_{\mathcal{M}}(p, q) \le \tau - \tau\sqrt{1 - \frac{2\|p - q\|_2}{\tau}}$.

⁵ We bound the second fundamental form by a fraction $1/\tau$ to make explicit the correspondence to the condition number
represented by the same fraction.

⁶ Suppose $\gamma(t)$ denotes a unit speed parameterization of the geodesic path joining $p$ and $q$. By parallel transport [23], we
mean a vector field $v(t)$ defined along $\gamma(t)$ such that $v(0) = u$, $v(\mu) = v$, $\|v(t)\|_2 = \|u\|_2$, and $\langle v(t), \gamma'(t) \rangle = \langle u, \gamma'(0) \rangle$,
where the last two conditions mean that $v(t)$ maintains a constant length and angle with respect to the path $\gamma(t)$.

The proofs of these properties are simple consequences of the uniform boundedness of the second
fundamental form and follow the proofs of similar propositions in [24, Section 6]. The primary difference
is that the proofs in [24] use the condition number as a starting point instead of the bound on the second
fundamental form. The first property says that the worst-case curvature of any unit speed geodesic path
along the manifold is bounded by $1/\tau$. The second property states that for small geodesic distances, the
tangent spaces do not "twist" too much from one another. Thus, if we compare a tangent vector to its
parallel counterpart in another nearby tangent space, the angle between them is small. The last property
states that for points on the submanifold close together in Euclidean space, their geodesic and Euclidean
distances do not differ much. Negating the statement, we see that two points with large geodesic distance
cannot be arbitrarily close in Euclidean space.
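
As a sanity check of the self-avoidance property, one can consider a circle of radius $\tau$ in the plane, for which geodesic and Euclidean distances are available in closed form. The sketch assumes the inequalities $\|p - q\|_2 \ge d_{\mathcal{M}}(p, q) - d_{\mathcal{M}}(p, q)^2 / (2\tau)$ and $d_{\mathcal{M}}(p, q) \le \tau - \tau\sqrt{1 - 2\|p - q\|_2/\tau}$ (the latter when $\|p - q\|_2 \le \tau/2$), and that a circle of radius $\tau$ has second fundamental form bounded by $1/\tau$; the specific value of `tau` is arbitrary.

```python
import numpy as np

tau = 2.0  # circle radius, so the curvature of the circle is 1 / tau

# Arc lengths (geodesic distances) up to half the circumference, and matching chords.
geodesic = np.linspace(1e-3, np.pi * tau, 500)
chord = 2.0 * tau * np.sin(geodesic / (2.0 * tau))

# Assumed property 3: the chord is at least d - d^2 / (2 tau).
lower_bound = geodesic - geodesic ** 2 / (2.0 * tau)
self_avoidance_holds = bool(np.all(chord >= lower_bound))

# Assumed corollary: for chords of length at most tau / 2, the geodesic distance
# is at most tau - tau * sqrt(1 - 2 * chord / tau).
close = chord <= tau / 2.0
upper_bound = tau - tau * np.sqrt(1.0 - 2.0 * chord[close] / tau)
corollary_holds = bool(np.all(geodesic[close] <= upper_bound))
```

On this example both flags evaluate to true, consistent with the reading that nearby points in Euclidean space cannot be far apart along the manifold.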

Appendix B 
Covering Number of a Manifold 

The geodesic regularity $R$ of a manifold $\mathcal{M}$ allows us to quantify the geodesic covering number of
the manifold (i.e., how many geodesic balls of a certain radius are needed to cover the whole manifold).
More concretely, we say that a set $C$ is an $(\epsilon, d_{\mathcal{M}})$-cover for $\mathcal{M}$ if $\mathcal{M} \subset \bigcup_{b \in C} B_{\mathcal{M}}(b, \epsilon)$ where we recall
that $B_{\mathcal{M}}(b, \epsilon)$ is the geodesic ball of radius $\epsilon$ centered at $b$. This implies that for every $x \in \mathcal{M}$, we
can find a $b \in C$ such that $d_{\mathcal{M}}(b, x) \le \epsilon$. The $(\epsilon, d_{\mathcal{M}})$-cover $C$ with the minimal cardinality is denoted
by $C(\mathcal{M}, d_{\mathcal{M}}, \epsilon)$ and the cardinality of $C(\mathcal{M}, d_{\mathcal{M}}, \epsilon)$ is called the $(\epsilon, d_{\mathcal{M}})$-covering number of $\mathcal{M}$
or simply the geodesic covering number. The following lemma gives an upper bound on the geodesic
covering number of a manifold.

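Lemma B.1. The $(\epsilon, d_{\mathcal{M}})$-covering number of a $D$-dimensional Riemannian submanifold $\mathcal{M} \subset \mathbb{R}^N$ is
bounded by

$$|\mathcal{C}(\mathcal{M}, d_{\mathcal{M}}, \epsilon)| \le \frac{V}{\inf_{x \in \mathcal{M}} \operatorname{vol}\left(B_{\mathcal{M}}(x, \epsilon/2)\right)},$$

where $V := \operatorname{vol}(\mathcal{M})$. If $\mathcal{M}$ has geodesic regularity $R$, then

$$|\mathcal{C}(\mathcal{M}, d_{\mathcal{M}}, \epsilon)| \le \frac{(2R)^D \left(\sqrt{D/2 + 1}\right)^D V}{\pi^{D/2} \epsilon^D}. \tag{2}$$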

The proof of this lemma follows the arguments of the proof of [27, Proposition 10.1]. We remark that 
the definition of an equivalent geodesic covering regularity in [8] corresponds to ^= appearing in (2). 

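We will also need to cover subsets of $\mathbb{R}^N$ with Euclidean balls (instead of geodesic balls as in the
previous lemma). Thus, we say that the finite set $\mathcal{C}(\mathcal{M}, \|\cdot\|_2, \epsilon)$ (of minimal cardinality) is an $(\epsilon, \|\cdot\|_2)$-
cover for a subset $\mathcal{M} \subset \mathbb{R}^N$ if $\mathcal{M} \subset \bigcup_{b \in \mathcal{C}(\mathcal{M}, \|\cdot\|_2, \epsilon)} B_{\mathbb{R}^N}(b, \epsilon)$.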
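To make the notion of an $(\epsilon, \|\cdot\|_2)$-cover concrete, here is a minimal sketch that builds a cover of a finite point set greedily. This is an illustration under our own assumptions: the greedy rule yields a valid cover but not necessarily one of minimal cardinality, so it only upper-bounds the covering number of the sampled set. The helper names `greedy_cover` and `is_cover` are hypothetical.

```python
import math

def greedy_cover(points, eps):
    """Return centers such that every input point lies within eps of a center.
    A point becomes a new center only if no existing center is within eps."""
    centers = []
    for p in points:
        if all(math.dist(p, c) > eps for c in centers):
            centers.append(p)
    return centers

def is_cover(points, centers, eps):
    """Check the defining property of an (eps, Euclidean)-cover."""
    return all(any(math.dist(p, c) <= eps for c in centers) for p in points)

# Example: 1000 samples of the unit circle in R^2 (arbitrary test data).
samples = [(math.cos(2 * math.pi * k / 1000), math.sin(2 * math.pi * k / 1000))
           for k in range(1000)]
eps = 0.25
centers = greedy_cover(samples, eps)
print(len(centers), is_cover(samples, centers, eps))
```

Because the greedy centers are pairwise more than $\epsilon$ apart, a standard volume argument bounds their number by $(1 + 2/\epsilon)^2$ for points on the unit circle in $\mathbb{R}^2$.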
Appendix C
Proof of Theorem III.1

Mathematically, if we introduce some particular notation, the stable embedding statement (1) can
be presented in an equivalent way that is more useful for the desired proof. First, define the operator
$U : \mathbb{R}^N \setminus \{0\} \to S^{N-1}$ that takes a non-zero vector and projects it onto the unit sphere (i.e., for
any $x \in \mathbb{R}^N \setminus \{0\}$, $U(x) := \frac{x}{\|x\|_2}$). $U$ can also act on a subset of $\mathbb{R}^N$: if $\mathcal{M} \subset \mathbb{R}^N$, then
$U(\mathcal{M}) := \left\{ \frac{x}{\|x\|_2} \,\middle|\, x \in \mathcal{M} \setminus \{0\} \right\}$. Next, we define the difference between any two subsets $A - B$ (with
$A, B \subset \mathbb{R}^N$) as the set comprised of pairwise differences between the elements of the sets, $A - B :=
\{a - b \mid a \in A, b \in B\}$. Finally, for a finite subset $\mathcal{M}$ of $\mathbb{R}^N$, $|\mathcal{M}|$ denotes its cardinality.

Suppose $\mathcal{M}$ is the Riemannian submanifold considered in Theorem III.1 and define the set of chords
of $\mathcal{M}$ (i.e., the set of all normalized difference vectors in $\mathcal{M}$) as

$$U(\mathcal{M} - \mathcal{M}) = \left\{ \frac{x_1 - x_2}{\|x_1 - x_2\|_2} \,\middle|\, x_1, x_2 \in \mathcal{M},\ x_1 \ne x_2 \right\}.$$

Then, $\Phi$ provides a stable embedding of $\mathcal{M}$ with conditioning $\delta_{\mathcal{M}}$ if and only if $\sup_{x \in U(\mathcal{M} - \mathcal{M})} \left| \|\Phi x\|_2^2 - 1 \right| \le
\delta_{\mathcal{M}}$. In other words, $\Phi$ provides a stable embedding of $\mathcal{M}$ if and only if $\Phi$ approximately preserves the
norms of all elements in $U(\mathcal{M} - \mathcal{M})$. We will use this equivalence for the proof of Theorem III.1.

The proof of Theorem III.1 follows very closely the proof technique of [8] and consists
of three steps. The first step involves judiciously choosing a generalized covering set $B$ of the manifold
$\mathcal{M}$ using a collection of points on the manifold and their corresponding tangent planes. Lemma C.1 then
shows that every point of $U(\mathcal{M} - \mathcal{M})$ can be approximated by some point in $U(B - B)$. The second
step (encapsulated by Lemma C.2) then applies the JL lemma for RIP operators (i.e., Theorem II.1) to
obtain an approximate norm preservation of all elements of $U(B - B)$. Finally, in Section C-C, we extend
this approximate norm preservation to all points of $U(\mathcal{M} - \mathcal{M})$ via simple geometric arguments. As
described, the proof technique here differs from that of [8] mainly in the separation of the stable
embedding operator from the covering of the manifold.

A. Covering $U(\mathcal{M} - \mathcal{M})$

In this section, we construct a set $B$ and show in Lemma C.1 that $U(B - B)$ is a cover of $U(\mathcal{M} - \mathcal{M})$.
Let $A = A(T) := \mathcal{C}(\mathcal{M}, d_{\mathcal{M}}, T)$ for some $T \le \tau$ be the $(T, d_{\mathcal{M}})$-cover of $\mathcal{M}$ of minimum cardinality.
For any $x \in \mathcal{M}$, we can find an $a \in A$ such that $d_{\mathcal{M}}(a, x) \le T$. Define a generalized covering set $B$ of
the manifold $\mathcal{M}$ as

$$B = B(T) = \bigcup_{a \in A} \{a + T_a\mathcal{M}(T)\},$$

where $T_a\mathcal{M}(T) := \{u \in T_a\mathcal{M} \mid \|u\|_2 \le T\}$ refers to all tangent vectors of $\mathcal{M}$ at the point $a$ whose
lengths are at most $T$. $B$ is called a generalized covering set as it is a union of (subsets of) affine
$D$-dimensional planes of $\mathbb{R}^N$ (i.e., $B$ is not a finite set).

The goal of this section is to show that $U(B - B)$ is a suitable cover of $U(\mathcal{M} - \mathcal{M})$ as detailed in
the following lemma:

Lemma C.1. Let $B$ be defined as above. For $T \le \tau$, set $\epsilon(T) := 4\sqrt{T/\tau}$. Then, $U(B - B)$ is an $(\epsilon(T), \|\cdot\|_2)$-
cover of $U(\mathcal{M} - \mathcal{M})$. In other words, for every $u \in U(\mathcal{M} - \mathcal{M})$, we can find a $b \in U(B - B)$ such
that $\|u - b\|_2 \le \epsilon(T)$.

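Proof: To prove this lemma, we break the set of chords $U(\mathcal{M} - \mathcal{M})$ into sets of "long" and "short"
chords which we will cover separately. The sets of short and long chords (delineated by the Euclidean distance
$T/2$) are defined by:

$$U^s(\mathcal{M} - \mathcal{M}) := \left\{ \frac{x_1 - x_2}{\|x_1 - x_2\|_2} \,\middle|\, x_1, x_2 \in \mathcal{M},\ 0 < \|x_1 - x_2\|_2 \le \frac{T}{2} \right\},$$

$$U^l(\mathcal{M} - \mathcal{M}) := \left\{ \frac{x_1 - x_2}{\|x_1 - x_2\|_2} \,\middle|\, x_1, x_2 \in \mathcal{M},\ \|x_1 - x_2\|_2 > \frac{T}{2} \right\},$$

and $U(\mathcal{M} - \mathcal{M}) = U^s(\mathcal{M} - \mathcal{M}) \cup U^l(\mathcal{M} - \mathcal{M})$.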

Let us start with the cover of $U^s(\mathcal{M} - \mathcal{M})$ where, due to the locally Euclidean structure of manifolds,
the short chords in $U^s(\mathcal{M} - \mathcal{M})$ can be approximated by tangent vectors of the manifold. Pick an element
$\frac{x_1 - x_2}{\|x_1 - x_2\|_2}$ of $U^s(\mathcal{M} - \mathcal{M})$ where by definition $\|x_1 - x_2\|_2 \le \frac{T}{2}$. From Lemma A.1, $\|x_1 - x_2\|_2 \le \frac{T}{2} \le \frac{\tau}{2}$
(since we assume $T \le \tau$) implies that

$$d_{\mathcal{M}}(x_1, x_2) \le \tau - \tau\sqrt{1 - \frac{2\|x_1 - x_2\|_2}{\tau}} \le \tau - \tau\left(1 - \frac{2\|x_1 - x_2\|_2}{\tau}\right) = 2\|x_1 - x_2\|_2 \le T. \tag{3}$$

Now, let $\mu := d_{\mathcal{M}}(x_1, x_2)$ and let $\gamma(t)$ be the unit-speed geodesic parameterization from $x_2$ to $x_1$, where
$\gamma(0) = x_2$, $\gamma(\mu) = x_1$, and $\gamma'(0) \in U(T_{x_2}\mathcal{M})$. From Lemma A.1, we have

$$x_1 - x_2 = \gamma(\mu) - \gamma(0) = \mu\gamma'(0) + r, \tag{4}$$

with $\|r\|_2 \le \frac{\mu^2}{2\tau}$. Let $a \in A$ be the closest geodesic covering point to $x_2$ (so that $d_{\mathcal{M}}(a, x_2) \le T$) and
let $b \in U(T_a\mathcal{M})$ be the parallel transport of $\gamma'(0)$ onto $T_a\mathcal{M}$. First, $b \in U(B - B)$ by definition of the
set $B$ and second, Lemma A.1 says that $\langle \gamma'(0), b \rangle \ge 1 - \frac{d_{\mathcal{M}}(a, x_2)}{\tau} \ge 1 - \frac{T}{\tau}$, since $d_{\mathcal{M}}(a, x_2) \le T \le \tau$.
Thus,

$$\|\gamma'(0) - b\|_2^2 = \|\gamma'(0)\|_2^2 + \|b\|_2^2 - 2\langle \gamma'(0), b \rangle = 2\left(1 - \langle \gamma'(0), b \rangle\right) \le \frac{2T}{\tau}. \tag{5}$$

We now show that $b$ is indeed close to the short chord $\frac{x_1 - x_2}{\|x_1 - x_2\|_2}$ by combining (4) and (5):



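$$\left\| \frac{x_1 - x_2}{\|x_1 - x_2\|_2} - b \right\|_2 = \left\| \gamma'(0) - b + \left( \frac{\mu}{\|x_1 - x_2\|_2} - 1 \right)\gamma'(0) + \frac{r}{\|x_1 - x_2\|_2} \right\|_2 \le \sqrt{\frac{2T}{\tau}} + \left| \frac{\mu}{\|x_1 - x_2\|_2} - 1 \right| + \frac{\mu^2}{2\tau\|x_1 - x_2\|_2}. \tag{6}$$
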



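To remove the dependence of (6) on $\|x_1 - x_2\|_2$, we use Lemma A.1 to obtain

$$\|x_1 - x_2\|_2 \ge \mu - \frac{\mu^2}{2\tau} \quad \Longrightarrow \quad 1 \le \frac{\mu}{\|x_1 - x_2\|_2} \le \frac{1}{1 - \frac{\mu}{2\tau}} \le 1 + \frac{\mu}{\tau},$$

where we used the inequality $\frac{1}{1 - a} \le 1 + 2a$ whenever $0 \le a \le \frac{1}{2}$ (this is true since $\frac{\mu}{2\tau} \le \frac{T}{2\tau} \le \frac{1}{2}$); the lower
bound holds because the geodesic distance is never smaller than the Euclidean distance.
Applying this to (6) and applying $\mu \le T$ obtained earlier in (3), we obtain

$$\left\| \frac{x_1 - x_2}{\|x_1 - x_2\|_2} - b \right\|_2 \le \sqrt{\frac{2T}{\tau}} + \frac{T}{\tau} + \left(1 + \frac{T}{\tau}\right)\frac{T}{2\tau} \le 4\sqrt{\frac{T}{\tau}} =: \epsilon_1(T),$$

where we used the fact that $\frac{T}{2\tau} \le \frac{T}{\tau} \le \sqrt{\frac{T}{\tau}} \le 1$. This proves that for every element of $U^s(\mathcal{M} - \mathcal{M})$, we
can find an element $b \in U(B - B)$ that is within $\epsilon_1(T)$ of it. Thus, $U(B - B)$ is an $(\epsilon_1(T), \|\cdot\|_2)$-cover
of $U^s(\mathcal{M} - \mathcal{M})$.
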



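Let us now move on to covering $U^l(\mathcal{M} - \mathcal{M})$. Pick an element $\frac{x_1 - x_2}{\|x_1 - x_2\|_2}$ of $U^l(\mathcal{M} - \mathcal{M})$. For each
$x_i$, for $i = 1, 2$, choose its closest geodesic covering point $a_i \in A$ so that $\mu_i := d_{\mathcal{M}}(a_i, x_i) \le T$. Let
$\gamma_i(t)$ be the unit-speed geodesic parameterization from $a_i$ to $x_i$, so that $\gamma_i(0) = a_i$, $\gamma_i(\mu_i) = x_i$, and
$\gamma_i'(0) \in U(T_{a_i}\mathcal{M})$. From Lemma A.1, we have $x_i - a_i = \gamma_i(\mu_i) - \gamma_i(0) = \mu_i\gamma_i'(0) + r_i$, with $\|r_i\|_2 \le \frac{\mu_i^2}{2\tau}$.
Define $b_i = a_i + \mu_i\gamma_i'(0)$, where it is clear that $x_i - b_i = r_i$ and $b_i \in \{a_i + T_{a_i}\mathcal{M}(T)\} \subset B$. We will use
$\frac{b_1 - b_2}{\|b_1 - b_2\|_2} \in U(B - B)$ as a covering point near $\frac{x_1 - x_2}{\|x_1 - x_2\|_2}$. Following [9], we have

$$\begin{aligned}
\left\| \frac{x_1 - x_2}{\|x_1 - x_2\|_2} - \frac{b_1 - b_2}{\|b_1 - b_2\|_2} \right\|_2
&= \left\| \frac{(x_1 - x_2) - (b_1 - b_2)}{\|x_1 - x_2\|_2} + \frac{(b_1 - b_2)\left(\|b_1 - b_2\|_2 - \|x_1 - x_2\|_2\right)}{\|x_1 - x_2\|_2\,\|b_1 - b_2\|_2} \right\|_2 \\
&\le \left\| \frac{(x_1 - x_2) - (b_1 - b_2)}{\|x_1 - x_2\|_2} \right\|_2 + \left\| \frac{(b_1 - b_2)\left(\|b_1 - b_2\|_2 - \|x_1 - x_2\|_2\right)}{\|x_1 - x_2\|_2\,\|b_1 - b_2\|_2} \right\|_2.
\end{aligned}$$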
We will calculate each of the terms separately. For the first term, we see that

$$\left\| \frac{(x_1 - x_2) - (b_1 - b_2)}{\|x_1 - x_2\|_2} \right\|_2 = \frac{\|(x_1 - b_1) - (x_2 - b_2)\|_2}{\|x_1 - x_2\|_2} \le \frac{1}{\|x_1 - x_2\|_2}\left( \frac{\mu_1^2}{2\tau} + \frac{\mu_2^2}{2\tau} \right) \le \frac{T^2}{\tau\|x_1 - x_2\|_2}.$$

For the second term, we have

$$\begin{aligned}
\left\| \frac{(b_1 - b_2)\left(\|b_1 - b_2\|_2 - \|x_1 - x_2\|_2\right)}{\|x_1 - x_2\|_2\,\|b_1 - b_2\|_2} \right\|_2
&= \frac{\left| \|b_1 - b_2\|_2 - \|x_1 - x_2\|_2 \right|}{\|x_1 - x_2\|_2} \\
&\le \frac{\|(x_1 - x_2) - (b_1 - b_2)\|_2}{\|x_1 - x_2\|_2} \\
&= \frac{\|(x_1 - b_1) - (x_2 - b_2)\|_2}{\|x_1 - x_2\|_2} \\
&\le \frac{T^2}{\tau\|x_1 - x_2\|_2},
\end{aligned}$$


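where we used the reverse triangle inequality in the second line. Now, our definition of long chords
implies that $\|x_1 - x_2\|_2 > \frac{T}{2}$. Therefore,

$$\left\| \frac{x_1 - x_2}{\|x_1 - x_2\|_2} - \frac{b_1 - b_2}{\|b_1 - b_2\|_2} \right\|_2 \le \frac{2T^2}{\tau\|x_1 - x_2\|_2} \le \frac{4T}{\tau} =: \epsilon_2(T).$$

Thus, $U(B - B)$ is an $(\epsilon_2(T), \|\cdot\|_2)$-cover of $U^l(\mathcal{M} - \mathcal{M})$.

Putting everything together, since $\epsilon_1(T) = 4\sqrt{T/\tau} \ge 4T/\tau = \epsilon_2(T)$, we have that $U(B - B)$ is a
$(4\sqrt{T/\tau}, \|\cdot\|_2)$-cover of $U(\mathcal{M} - \mathcal{M})$.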
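The covering construction can be checked numerically on a toy manifold. The sketch below is an illustration under our own assumptions and not part of the proof: it takes a circle of radius $\tau$ in $\mathbb{R}^2$, places cover points at most $2T$ of arc apart (so every point is within geodesic distance $T$ of one), and for a given chord builds the candidate element of $U(B - B)$ following the short-chord and long-chord recipes above. On a circle, parallel transport of a tangent along the curve is again the tangent, so the short-chord candidate is simply the unit tangent at the nearby cover point. All function names are hypothetical.

```python
import math
import random

def unit(v):
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def point(tau, ang):
    return (tau * math.cos(ang), tau * math.sin(ang))

def tangent(ang):
    return (-math.sin(ang), math.cos(ang))

def cover_error(tau, T, ang1, ang2):
    """Distance from the chord of (x1, x2) to its candidate in U(B - B)."""
    # Cover angles are multiples of `step`; the arc spacing tau * step is at
    # most 2T, so every point is within geodesic distance T of a cover point.
    n = math.ceil(math.pi * tau / T)
    step = 2.0 * math.pi / n
    nearest = lambda ang: round(ang / step) * step

    x1, x2 = point(tau, ang1), point(tau, ang2)
    diff = sub(x1, x2)
    chord = unit(diff)
    if math.hypot(*diff) <= T / 2.0:
        # Short chord: unit tangent at the cover point nearest x2, with the
        # sign chosen to agree with the chord (both signs lie in U(T_a M)).
        t = tangent(nearest(ang2))
        if t[0] * chord[0] + t[1] * chord[1] < 0:
            t = (-t[0], -t[1])
        approx = t
    else:
        # Long chord: b_i = a_i + mu_i * gamma_i'(0), written with a signed
        # arc length s along the tangent at a_i.
        b = []
        for ang in (ang1, ang2):
            a = nearest(ang)
            s = tau * (ang - a)
            pa, ta = point(tau, a), tangent(a)
            b.append((pa[0] + s * ta[0], pa[1] + s * ta[1]))
        approx = unit(sub(b[0], b[1]))
    return math.hypot(*sub(chord, approx))

random.seed(0)
tau, T = 1.0, 0.1  # arbitrary test values with T < tau / 2
worst = max(cover_error(tau, T, random.uniform(0, 2 * math.pi),
                        random.uniform(0, 2 * math.pi)) for _ in range(2000))
print(worst <= 4 * math.sqrt(T / tau))
```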



B. Applying the JL Lemma 

We want to use $U(B - B)$ as a proxy for $U(\mathcal{M} - \mathcal{M})$ for applying Theorem II.1. However, $U(B - B)$
is not just a finite collection of points and thus Theorem II.1 cannot be applied directly. Fortunately, it
is well-known that unit spheres on planes (or affine planes) can be well-covered by a finite collection
of points. Indeed, as a corollary to Lemma B.1 (see also [27]), the $(\epsilon, \|\cdot\|_2)$-covering number of a
$D$-dimensional sphere is at most $\left(1 + \frac{2}{\epsilon}\right)^D$.

$U(B - B)$ can be divided into two sets of elements, namely:

1) $B_1 := U\left( \bigcup_{a \in A} \{T_a\mathcal{M}(T) - T_a\mathcal{M}(T)\} \right) = U\left( \bigcup_{a \in A} T_a\mathcal{M} \right)$, and

2) $B_2 := U\left( \bigcup_{a_1, a_2 \in A,\ a_1 \ne a_2} \{(a_1 - a_2) + (T_{a_1}\mathcal{M}(T) - T_{a_2}\mathcal{M}(T))\} \right)$.

The set $B_1$ is comprised of $|A|$ $D$-dimensional unit spheres. From our earlier discussion, we know that
each unit sphere can be $(\epsilon, \|\cdot\|_2)$-covered by at most $\left(1 + \frac{2}{\epsilon}\right)^D$ points. Thus, $|\mathcal{C}(B_1, \|\cdot\|_2, \epsilon)| \le
|A|\left(1 + \frac{2}{\epsilon}\right)^D$. The set $B_2$ is the projection onto the unit sphere (in $\mathbb{R}^N$) of not more than $|A|^2$ subsets
of affine planes, where each affine plane is contained in a linear subspace of dimension $2D + 1$. Thus,
$|\mathcal{C}(B_2, \|\cdot\|_2, \epsilon)| \le |A|^2\left(1 + \frac{2}{\epsilon}\right)^{2D+1}$.

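Define the collection of points $E(\epsilon) := \mathcal{C}(B_1, \|\cdot\|_2, \epsilon) \cup \mathcal{C}(B_2, \|\cdot\|_2, \epsilon)$. From our previous
discussion, the cardinality of $E(\epsilon)$ is bounded by

$$|E(\epsilon)| \le |A|\left(1 + \frac{2}{\epsilon}\right)^D + |A|^2\left(1 + \frac{2}{\epsilon}\right)^{2D+1} \le 2|A|^2\left(1 + \frac{2}{\epsilon}\right)^{2D+1}. \tag{7}$$
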



By construction, for any $b \in U(B - B)$, we can find an $e \in E(\epsilon)$ such that $\|b - e\|_2 \le \epsilon$. With the aid
of $E(\epsilon)$, we can show the stable embedding of $U(B - B)$ by the operator $\widetilde{\Phi}$ defined in Theorem III.1.
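The cardinality bound (7) is easy to evaluate numerically. The following sketch (with hypothetical function names and arbitrary example values) computes both the two-term bound and its simplified form, which can help in seeing how quickly the net size grows with $D$ and $1/\epsilon$.

```python
def sphere_cover_bound(dim, eps):
    """Upper bound (1 + 2/eps)^dim on the eps-covering number of a unit sphere
    lying in a dim-dimensional subspace."""
    return (1.0 + 2.0 / eps) ** dim

def net_size_bounds(num_anchors, D, eps):
    """Return (two-term bound, simplified bound) from inequality (7),
    where num_anchors plays the role of |A|."""
    b1 = num_anchors * sphere_cover_bound(D, eps)
    b2 = num_anchors ** 2 * sphere_cover_bound(2 * D + 1, eps)
    simplified = 2 * num_anchors ** 2 * sphere_cover_bound(2 * D + 1, eps)
    return b1 + b2, simplified

two_term, simplified = net_size_bounds(num_anchors=50, D=2, eps=0.1)
print(two_term <= simplified)
```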



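Lemma C.2. Choose any failure probability $\rho$ and conditioning $\delta'_{\mathcal{M}} < \frac{2}{7}$. Set the covering resolution $\epsilon$ in
the set $E(\epsilon)$ to $\epsilon = \frac{\delta'_{\mathcal{M}}}{N+1}$. Suppose we have a matrix $\Phi$ satisfying the RIP of order $S \ge 40\log\left(\frac{4|E(\epsilon)|}{\rho}\right)$ and
conditioning $\delta \le \frac{\delta'_{\mathcal{M}}}{4}$. Then, with probability exceeding $1 - \rho$, the matrix $\widetilde{\Phi} := \Phi D_\xi$ is a (non-squared)
stable embedding$^7$ of $B$ with conditioning $\frac{9}{4}\delta'_{\mathcal{M}}$ (i.e., $\sup_{b \in U(B - B)} \left| \|\widetilde{\Phi}b\|_2 - 1 \right| \le \frac{9}{4}\delta'_{\mathcal{M}}$).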



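Proof of Lemma C.2: Fix $\rho < 1$ and $\delta'_{\mathcal{M}} < \frac{2}{7}$. If $\Phi$ satisfies RIP-$(S, \delta)$ with $S \ge 40\log\left(\frac{4|E(\epsilon)|}{\rho}\right)$
(with $\epsilon$ to be defined later) and $\delta \le \frac{\delta'_{\mathcal{M}}}{4}$ as assumed in Lemma C.2, then Theorem II.1 states that
with probability exceeding $1 - \rho$,

$$\sup_{e \in E(\epsilon)} \left| \|\widetilde{\Phi}e\|_2 - 1 \right| \le \sup_{e \in E(\epsilon)} \left| \|\widetilde{\Phi}e\|_2^2 - 1 \right| \le \delta'_{\mathcal{M}}.$$

For a fixed $b \in U(B - B)$, find its nearest covering point $e \in E(\epsilon)$ such that $\|b - e\|_2 \le \epsilon$. Then, we have

$$\|\widetilde{\Phi}b\|_2 \le \|\widetilde{\Phi}e\|_2 + \|\widetilde{\Phi}(b - e)\|_2 \le (1 + \delta'_{\mathcal{M}}) + \|\Phi\|_2\,\|D_\xi(b - e)\|_2. \tag{8}$$

Now, it is easy to show that for a matrix $\Phi \in \mathbb{C}^{M \times N}$ that satisfies RIP-$(S, \delta)$, $\|\Phi\|_2 \le \left(\frac{N}{S} + 1\right)(1 + \delta)$.
Applying this fact to (8), and noting that $\|D_\xi(b - e)\|_2 = \|b - e\|_2 \le \epsilon$, we have

$$\|\widetilde{\Phi}b\|_2 \le (1 + \delta'_{\mathcal{M}}) + \left(\frac{N}{S} + 1\right)(1 + \delta)\,\epsilon \le (1 + \delta'_{\mathcal{M}}) + (N + 1)\left(1 + \frac{\delta'_{\mathcal{M}}}{4}\right)\epsilon.$$

To remove the catastrophic dependence on $(N + 1)$, set $\epsilon = \frac{\delta'_{\mathcal{M}}}{N+1}$. Using this choice of $\epsilon$, we have

$$\|\widetilde{\Phi}b\|_2 \le (1 + \delta'_{\mathcal{M}}) + \left(1 + \frac{\delta'_{\mathcal{M}}}{4}\right)\delta'_{\mathcal{M}} = 1 + 2\delta'_{\mathcal{M}} + \frac{(\delta'_{\mathcal{M}})^2}{4} \le 1 + \frac{9}{4}\delta'_{\mathcal{M}},$$
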



7 Squared and non-squared stable embeddings differ only by a small constant in their conditioning. To be more concrete, suppose
$C \subset \mathbb{R}^N$. Then it can be shown that $\sup_{c \in C} \left| \|\Phi c\|_2 - 1 \right| \le \sup_{c \in C} \left| \|\Phi c\|_2^2 - 1 \right|$. Furthermore, if $\sup_{c \in C} \left| \|\Phi c\|_2 - 1 \right| < 1$,
then it can be shown that $\sup_{c \in C} \left| \|\Phi c\|_2^2 - 1 \right| \le 3\sup_{c \in C} \left| \|\Phi c\|_2 - 1 \right|$.
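where we used the fact that $\delta'_{\mathcal{M}} < \frac{2}{7} < 1$. Using the same steps for the lower bound, we obtain

$$\|\widetilde{\Phi}b\|_2 \ge \|\widetilde{\Phi}e\|_2 - \|\widetilde{\Phi}(b - e)\|_2 \ge (1 - \delta'_{\mathcal{M}}) - (N + 1)\left(1 + \frac{\delta'_{\mathcal{M}}}{4}\right)\frac{\delta'_{\mathcal{M}}}{N + 1} \ge 1 - \frac{9}{4}\delta'_{\mathcal{M}}.$$

Since the upper and lower bounds are symmetric about 1, and they are valid for all $b \in U(B - B)$, we arrive at our
required conclusion. ■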
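The operator $\Phi D_\xi$ in Lemma C.2 only flips the signs of the columns of $\Phi$ at random. The sketch below illustrates this step in plain Python under our own assumptions: the Gaussian matrix is merely a stand-in for a matrix that satisfies the RIP (nothing here verifies the RIP), and the helper names are hypothetical. It also checks the identity $(\Phi D_\xi)x = \Phi(\xi \odot x)$, which is why $\|D_\xi(b - e)\|_2 = \|b - e\|_2$ in the proof.

```python
import random

def random_signs(n, rng):
    """Draw a Rademacher sequence xi with independent +1/-1 entries."""
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]

def flip_columns(mat, signs):
    """Form Phi * D_xi by multiplying column j of Phi by signs[j]."""
    return [[a * s for a, s in zip(row, signs)] for row in mat]

rng = random.Random(0)
M, N = 4, 10  # arbitrary small sizes for illustration
phi = [[rng.gauss(0.0, 1.0) / M ** 0.5 for _ in range(N)] for _ in range(M)]
xi = random_signs(N, rng)
phi_tilde = flip_columns(phi, xi)

x = [rng.uniform(-1.0, 1.0) for _ in range(N)]
lhs = matvec(phi_tilde, x)                             # (Phi D_xi) x
rhs = matvec(phi, [s * v for s, v in zip(xi, x)])      # Phi (xi * x)
print(max(abs(a - b) for a, b in zip(lhs, rhs)) < 1e-12)
```

Since flipping signs costs only $N$ multiplications, any fast multiplication routine available for a structured $\Phi$ carries over to $\Phi D_\xi$.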



C. Synthesis 

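Finally, it remains to extend the stable embedding from $U(B - B)$ to $U(\mathcal{M} - \mathcal{M})$. From Lemma C.1,
for any $u \in U(\mathcal{M} - \mathcal{M})$, we can find a $b \in U(B - B)$ such that $\|b - u\|_2 \le \epsilon(T)$ with $\epsilon(T) = 4\sqrt{T/\tau}$.
Using Lemma C.2 (with $\rho$ fixed and $\delta'_{\mathcal{M}} < \frac{2}{7}$ to be defined later), triangle inequalities, and the fact that
$\|\Phi\|_2 \le \left(\frac{N}{S} + 1\right)\left(1 + \frac{\delta'_{\mathcal{M}}}{4}\right)$, we have

$$\|\widetilde{\Phi}u\|_2 \le \|\widetilde{\Phi}b\|_2 + \|\Phi\|_2\,\|D_\xi(b - u)\|_2 \le \left(1 + \frac{9}{4}\delta'_{\mathcal{M}}\right) + \left(1 + \frac{\delta'_{\mathcal{M}}}{4}\right)(N + 1)\,\epsilon(T). \tag{9}$$

Set $T$ such that $\epsilon(T) = \frac{\delta'_{\mathcal{M}}}{N+1}$. By using the formula for $\epsilon(T)$, we have that $T = \frac{\tau(\delta'_{\mathcal{M}})^2}{16(N+1)^2}$. One can
check that $T \le \tau$, which fulfills the condition of Lemma C.1. Plugging this choice of $\epsilon(T)$ into (9), we
get

$$\|\widetilde{\Phi}u\|_2 \le \left(1 + \frac{9}{4}\delta'_{\mathcal{M}}\right) + \left(1 + \frac{\delta'_{\mathcal{M}}}{4}\right)\delta'_{\mathcal{M}} \le 1 + \frac{7}{2}\delta'_{\mathcal{M}},$$

where we used the fact that $\delta'_{\mathcal{M}} < \frac{2}{7} < 1$. For the lower conditioning bound, we use the same estimates
to arrive at

$$\|\widetilde{\Phi}u\|_2 \ge \|\widetilde{\Phi}b\|_2 - \|\Phi\|_2\,\|D_\xi(b - u)\|_2 \ge \left(1 - \frac{9}{4}\delta'_{\mathcal{M}}\right) - \left(1 + \frac{\delta'_{\mathcal{M}}}{4}\right)\delta'_{\mathcal{M}} \ge 1 - \frac{7}{2}\delta'_{\mathcal{M}}.$$

Since the upper and lower bounds are symmetric about 1, we have via the squared and non-squared conditioning bounds

$$\sup_{u \in U(\mathcal{M} - \mathcal{M})} \left| \|\widetilde{\Phi}u\|_2^2 - 1 \right| \le 3\sup_{u \in U(\mathcal{M} - \mathcal{M})} \left| \|\widetilde{\Phi}u\|_2 - 1 \right| \le \frac{21}{2}\delta'_{\mathcal{M}}.$$
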

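It remains to do some bookkeeping. First, given a predetermined stable manifold embedding conditioning
$\delta_{\mathcal{M}} < 1$, set $\delta'_{\mathcal{M}} = \frac{2}{21}\delta_{\mathcal{M}}$. It is clear that this choice of $\delta'_{\mathcal{M}}$ validates the assumption
that $\delta'_{\mathcal{M}} < \frac{2}{7}$ in Lemma C.2, and we have $\sup_{u \in U(\mathcal{M} - \mathcal{M})} \left| \|\widetilde{\Phi}u\|_2^2 - 1 \right| \le \delta_{\mathcal{M}}$, which is what we are
trying to prove. Next, according to the JL lemma for RIP operators (Lemma C.2), the RIP conditioning
for the matrix $\Phi$ needs to satisfy $\delta \le \frac{\delta'_{\mathcal{M}}}{4} = \frac{\delta_{\mathcal{M}}}{42}$. This is the condition for the RIP conditioning in
Theorem III.1. Finally, according to the JL lemma for RIP operators (Lemma C.2), the RIP order needs
to satisfy $S \ge 40\log\left(\frac{4\left|E\left(\delta'_{\mathcal{M}}/(N+1)\right)\right|}{\rho}\right)$. For this, we need to do some work. First, using (7), we have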
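$$\left|E\left(\frac{\delta'_{\mathcal{M}}}{N+1}\right)\right| \le 2|A|^2\left(1 + \frac{21(N+1)}{\delta_{\mathcal{M}}}\right)^{2D+1}.$$

Now $|A|$ depends on the geodesic covering
resolution $T$, which was set to be $T = \frac{\tau(\delta'_{\mathcal{M}})^2}{16(N+1)^2} = \frac{\tau\delta_{\mathcal{M}}^2}{1764(N+1)^2}$. By Lemma B.1, which gives the geodesic
covering number of a manifold with geodesic regularity $R$, we have

$$\begin{aligned}
\log(|A|) &\le \log\left( \frac{(2R)^D\left(\sqrt{D/2 + 1}\right)^D V}{\pi^{D/2}\,T^D} \right) \\
&= \log\left( \frac{(3528R)^D\left(\sqrt{D/2 + 1}\right)^D (N+1)^{2D}\,V}{\pi^{D/2}\,\delta_{\mathcal{M}}^{2D}\,\tau^D} \right) \\
&= D\log\left( \frac{3528R\left(\sqrt{D/2 + 1}\right)(N+1)^2}{\sqrt{\pi}\,\delta_{\mathcal{M}}^2\,\tau} \right) + \log(V).
\end{aligned}$$

Putting everything together, the order $S$ of the RIP of the matrix $\Phi$ must satisfy

$$S \ge 40\left( 2D\log\left( \frac{3528R\left(\sqrt{D/2 + 1}\right)(N+1)^2}{\sqrt{\pi}\,\delta_{\mathcal{M}}^2\,\tau} \right) + (2D + 1)\log\left( 1 + \frac{21(N+1)}{\delta_{\mathcal{M}}} \right) + \log\left( \frac{8V^2}{\rho} \right) \right).$$

This concludes the proof of Theorem III.1.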
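For readers who want to see how the bookkeeping fits together numerically, the sketch below evaluates the derived constants and the final bound on the RIP order as reconstructed in this appendix. It is illustrative only: the function names are hypothetical, the example parameter values are arbitrary, and the constants (for instance $\delta'_{\mathcal{M}} = 2\delta_{\mathcal{M}}/21$ and the factor 3528) are taken from the expressions above rather than from an independent source.

```python
import math

def derived_constants(delta_M, tau, N):
    """Return (delta'_M, RIP conditioning delta, geodesic resolution T)."""
    delta_prime = 2.0 * delta_M / 21.0
    rip_delta = delta_prime / 4.0                          # equals delta_M / 42
    T = tau * delta_prime ** 2 / (16.0 * (N + 1) ** 2)     # from eps(T) = delta'_M / (N + 1)
    return delta_prime, rip_delta, T

def rip_order_bound(D, N, R, V, tau, delta_M, rho):
    """Evaluate the closed-form lower bound on the RIP order S."""
    cover_term = 2 * D * math.log(
        3528.0 * R * math.sqrt(D / 2.0 + 1.0) * (N + 1) ** 2
        / (math.sqrt(math.pi) * delta_M ** 2 * tau))
    net_term = (2 * D + 1) * math.log(1.0 + 21.0 * (N + 1) / delta_M)
    return 40.0 * (cover_term + net_term + math.log(8.0 * V ** 2 / rho))

def rip_order_direct(D, N, R, V, tau, delta_M, rho):
    """Same quantity, assembled step by step as 40 * log(4 |E| / rho)."""
    delta_prime, _, T = derived_constants(delta_M, tau, N)
    A = (2.0 * R / T) ** D * (D / 2.0 + 1.0) ** (D / 2.0) * V / math.pi ** (D / 2.0)
    eps = delta_prime / (N + 1)
    E = 2.0 * A ** 2 * (1.0 + 2.0 / eps) ** (2 * D + 1)
    return 40.0 * math.log(4.0 * E / rho)

params = dict(D=2, N=100, R=1.0, V=1.0, tau=1.0, delta_M=0.5, rho=0.01)
print(math.isclose(rip_order_bound(**params), rip_order_direct(**params), rel_tol=1e-9))
```

The two functions agree because the closed form is just the logarithm of the step-by-step product; the dependence on the ambient dimension $N$ is logarithmic.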